
    Disinformation Capabilities of Large Language Models

    Automated disinformation generation is often listed as one of the risks of large language models (LLMs). The theoretical ability to flood the information space with disinformation content might have dramatic consequences for democratic societies around the world. This paper presents a comprehensive study of the disinformation capabilities of the current generation of LLMs to generate false news articles in the English language. In our study, we evaluated the capabilities of 10 LLMs using 20 disinformation narratives. We evaluated several aspects of the LLMs: how good they are at generating news articles, how strongly they tend to agree or disagree with the disinformation narratives, how often they generate safety warnings, etc. We also evaluated the ability of detection models to identify these articles as LLM-generated. We conclude that LLMs are able to generate convincing news articles that agree with dangerous disinformation narratives.

    Automated, not Automatic: Needs and Practices in European Fact-checking Organizations as a basis for Designing Human-centered AI Systems

    To mitigate the negative effects of false information more effectively, the development of automated AI (artificial intelligence) tools assisting fact-checkers is needed. Despite the existing research, there is still a gap between fact-checking practitioners' needs and pains and the current AI research. We aspire to bridge this gap by employing methods of information behavior research to identify implications for designing better human-centered AI-based supporting tools. In this study, we conducted semi-structured in-depth interviews with Central European fact-checkers. The information behavior and the requirements for desired supporting tools were analyzed using iterative bottom-up content analysis, borrowing techniques from grounded theory. The most significant needs were validated with a survey extended to fact-checkers from across Europe, in which we collected 24 responses from 20 European countries, i.e., 62% of active European IFCN (International Fact-Checking Network) signatories. Our contributions are theoretical as well as practical. First, by being able to map our findings about the needs of fact-checking organizations to the relevant tasks for AI research, we have shown that the methods of information behavior research are relevant for studying the processes in the organizations and that these methods can be used to bridge the gap between the users and AI researchers. Second, we have identified fact-checkers' needs and pains, focusing on previously unexplored dimensions and emphasizing the needs of fact-checkers from Central and Eastern Europe as well as from low-resource language groups, which has implications for the development of new resources (datasets) as well as for the focus of AI research in this domain.
    Comment: 41 pages, 13 figures, 1 table, 2 annexes

    Is it indeed bigger better? The comprehensive study of claim detection LMs applied for disinformation tackling

    This study compares the performance of (1) fine-tuned models and (2) extremely large language models on the task of check-worthy claim detection. For the purpose of the comparison, we composed a multilingual and multi-topical dataset comprising texts of various sources and styles. Building on this, we performed a benchmark analysis to determine the most general multilingual and multi-topical claim detector. We chose three state-of-the-art models in the check-worthy claim detection task and fine-tuned them. Furthermore, we selected three state-of-the-art extremely large language models without any fine-tuning. We made modifications to the models to adapt them to multilingual settings. Through extensive experimentation and evaluation, we assessed the performance of all the models in terms of accuracy, recall, and F1-score in both in-domain and cross-domain scenarios. Our results demonstrate that, despite the technological progress in the area of natural language processing, the models fine-tuned for the task of check-worthy claim detection still outperform the zero-shot approaches in cross-domain settings.
    Comment: 27 pages, 10 figures
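The in-domain versus cross-domain evaluation described above can be sketched with the metrics the abstract names. The labels and predictions below are illustrative placeholders, not the paper's data; the point is only the metric computation and the typical cross-domain drop.

```python
# Toy sketch of the in-domain vs cross-domain comparison: compute
# accuracy, recall, and F1 for hypothetical claim-detection predictions.

def precision_recall_f1(gold, pred, positive=1):
    tp = sum(1 for g, p in zip(gold, pred) if g == p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def accuracy(gold, pred):
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

# Hypothetical outputs of one fine-tuned model: it errs once in-domain
# but degrades noticeably when the test domain differs from training.
gold         = [1, 0, 1, 1, 0, 0, 1, 0]
in_domain    = [1, 0, 1, 1, 0, 0, 1, 1]
cross_domain = [1, 0, 0, 1, 0, 1, 0, 1]

print(accuracy(gold, in_domain))     # 0.875
print(accuracy(gold, cross_domain))  # 0.5
```

Comparing the same metrics across the two settings is what makes the fine-tuned-vs-zero-shot conclusion measurable.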

    Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification

    Adapters and Low-Rank Adaptation (LoRA) are parameter-efficient fine-tuning techniques designed to make the training of language models more efficient. Previous results demonstrated that these methods can even improve performance on some classification tasks. This paper complements the existing research by investigating how these techniques influence classification performance and computation costs compared to full fine-tuning when applied to multilingual text classification tasks (genre, framing, and persuasion techniques detection, with different input lengths, numbers of predicted classes, and classification difficulty), some of which have limited training data. In addition, we conduct in-depth analyses of their efficacy across different training scenarios (training on the original multilingual data; on the translations into English; and on a subset of English-only data) and different languages. Our findings provide valuable insights into the applicability of parameter-efficient fine-tuning techniques, particularly to complex multilingual and multilabel classification tasks.
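The parameter-count argument behind LoRA can be made concrete: instead of updating a full d x k weight matrix, one trains two low-rank factors B (d x r) and A (r x k) and applies W + (alpha / r) * B @ A. A minimal sketch, assuming a single weight matrix (real setups repeat this per attention layer, and the dimensions below are illustrative):

```python
# Parameter counts: full fine-tuning trains every weight of a d x k
# matrix; LoRA freezes W and trains only the low-rank factors B and A.

def full_finetune_params(d, k):
    return d * k

def lora_params(d, k, r):
    # B is d x r, A is r x k; the frozen W contributes no trainable params.
    return d * r + r * k

d, k, r = 4096, 4096, 8
print(full_finetune_params(d, k))  # 16777216
print(lora_params(d, k, r))        # 65536
print(lora_params(d, k, r) / full_finetune_params(d, k))  # 0.00390625
```

At rank 8 the trainable-parameter footprint drops below 0.4% of full fine-tuning for this matrix, which is why these techniques can be cheap enough for the limited-data multilingual settings the paper studies.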

    A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts

    In the realm of text manipulation and linguistic transformation, the question of authorship has always been a subject of fascination and philosophical inquiry. Much like the Ship of Theseus paradox, which ponders whether a ship remains the same when each of its original planks is replaced, our research delves into an intriguing question: does a text retain its original authorship when it undergoes numerous paraphrasing iterations? Specifically, since Large Language Models (LLMs) have demonstrated remarkable proficiency in the generation of both original content and the modification of human-authored texts, a pivotal question emerges concerning the determination of authorship in instances where LLMs or similar paraphrasing tools are employed to rephrase the text. This inquiry revolves around whether authorship should be attributed to the original human author or the AI-powered tool, given the tool's independent capacity to produce text that closely resembles human-generated content. Therefore, we embark on a philosophical voyage through the seas of language and authorship to unravel this intricate puzzle.
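The "plank replacement" intuition above can be made quantitative by measuring how much of the original wording survives each paraphrasing pass, e.g. with Jaccard word overlap. In this toy sketch, each round's synonym table is a hypothetical stand-in for one LLM paraphrasing pass; the sentence and tables are illustrative, not from the paper.

```python
# Ship of Theseus, numerically: word overlap with the original text
# shrinks as successive paraphrasing rounds replace more "planks".

ROUNDS = [
    {"ship": "vessel", "planks": "boards"},
    {"remains": "stays", "replaced": "swapped"},
    {"original": "initial", "same": "identical"},
]

def paraphrase(words, table):
    return [table.get(w, w) for w in words]

def jaccard(a, b):
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

original = "a ship remains the same when its original planks are replaced".split()
text = original
for i, table in enumerate(ROUNDS, 1):
    text = paraphrase(text, table)
    print(i, round(jaccard(original, text), 2))  # 0.69, then 0.47, then 0.29
```

An authorship-attribution system would see surface similarity to the human original decay monotonically, even though each individual rewrite preserved the meaning, which is exactly the tension the paper explores.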

    Unravelling the basic concepts and intents of misbehavior in post-truth society

    Objective: To explore the definitions of and connections between the terms misinformation, disinformation, fake news, rumors, hoaxes, propaganda, and related forms of misbehavior in the online environment. Another objective is to infer the intent of the authors, where relevant.
    Design/Methodology/Approach: A conceptual analysis of three hundred fifty articles or monographs from all types of disciplines was utilized, with priority given to articles focused on terminological analysis. A conceptual map of the terminology relevant to the post-truth era was created. In cases of a lack of agreement, the etymology of the terms, drawing on dictionaries, terminological databases, and encyclopedias, was favored.
    Results/Discussion: The approach made it possible to delimit the borders between the core terms of post-truth society and to classify them according to the intents of the authors: power (influence), money, fun, sexual harassment, hate/discord, ignorance, passion, and socialization. The following features were identified to differentiate the concepts: falsity (misleadingness, deceptiveness, lack of verification), accuracy, completeness, currency, medium, intent, and analyzable unit. The conceptual map summarizing and visualizing our findings is attached in the article.
    Conclusions: We argued that disinformation and misinformation are different terms with different authors and intents in the online environment. Likewise, fake news was delimited as a species of disinformation, limited by its medium and financial intent. The intent of hoaxers is rather the amusement of the authors or the spreading of discord between different groups of society. The intent and the analyzable units identified in the literature (statement, claim, article, message, event, story, and narrative) are crucial for understanding and communication between social (human) scientists and computer scientists in order to better detect and mitigate various types of false information.
    Originality/Value: The study provides a theoretical background for detecting, analyzing, and mitigating false information and misbehavior.

    Tracing Strength of Relationships in Social Networks

    The current web is known as a space with constantly growing interactivity among its users. It is changing from a data store into a place of social interaction where people not only search for interesting information, but also communicate and collaborate. Social networks are obviously the most used places for common interaction among people. We present a method for analyzing the strength of relationships together with their evolution, based on various user activities in social networks. We evaluate our approach within the Facebook social network.
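Scoring tie strength from logged user activities, as described above, can be sketched as a weighted sum of interactions with a time decay so that the score also reflects the relationship's evolution. The activity weights, the activity types, and the exponential half-life decay below are assumptions for illustration, not the paper's actual formula.

```python
import math

# Illustrative tie-strength score: each interaction contributes its
# type weight, discounted exponentially by its age in days.
WEIGHTS = {"comment": 3.0, "message": 2.0, "like": 1.0}

def tie_strength(interactions, now, half_life_days=30.0):
    """Sum weighted (kind, day) interactions, decayed by age."""
    decay = math.log(2) / half_life_days
    score = 0.0
    for kind, day in interactions:
        score += WEIGHTS[kind] * math.exp(-decay * (now - day))
    return score

# Two relationships: recent frequent contact vs old sporadic contact.
recent = [("comment", 99), ("message", 98), ("like", 100)]
old = [("comment", 10), ("message", 5), ("like", 20)]
print(tie_strength(recent, now=100) > tie_strength(old, now=100))  # True
```

The decay term is what lets the same model capture both a strong current tie and a tie that has faded, which matches the paper's goal of tracing strength together with its evolution.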